The Project Cetacean Translation Initiative (CETI) is using machine learning to try to understand the vocalizations of sperm
whales. Credit: Franco Banfi/Minden Pictures

Underneath the thick forest canopy on a remote island in the South Pacific, a
New Caledonian Crow peers from its perch, dark eyes glittering. The bird
carefully removes a branch, strips off unwanted leaves with its bill and fashions
a hook from the wood. The crow is a perfectionist: if it makes an error, it will scrap the
whole thing and start over. When it's satisfied, the bird pokes the finished utensil into a
crevice in the tree and fishes out a wriggling grub.


The New Caledonian Crow is one of the only birds known to manufacture tools, a skill once 
thought to be unique to humans. Christian Rutz, a behavioral ecologist at the University of 
St Andrews in Scotland, has spent much of his career studying the crow's capabilities. The 
remarkable ingenuity Rutz observed changed his understanding of what birds can do. He 
started wondering if there might be other overlooked animal capacities. The crows live in 
complex social groups and may pass toolmaking techniques on to their offspring. 
Experiments have also shown that different crow groups around the island have distinct 
vocalizations. Rutz wanted to know whether these dialects could help explain cultural
differences in toolmaking among the groups.


New technology powered by artificial intelligence is poised to provide exactly these kinds 
of insights. Whether animals communicate with one another in terms we might be able to 
understand is a question of enduring fascination. Although people in many Indigenous 
cultures have long believed that animals can intentionally communicate, Western 
scientists traditionally have shied away from research that blurs the lines between humans 
and other animals for fear of being accused of anthropomorphism. But with recent 
breakthroughs in AI, “people realize that we are on the brink of fairly major advances in
regard to understanding animals' communicative behavior,” Rutz says.


Beyond creating chatbots that woo people and producing art that wins fine-arts 


of artificial-intelligence scientists, biologists and conservation experts is collecting a wide 
range of data from a variety of species and building machine-learning models to analyze 
them. Other groups such as the Project Cetacean Translation Initiative (CETI) are focusing
on trying to understand a particular species, in this case the sperm whale.


Decoding animal vocalizations could aid conservation and welfare efforts. It could also
have a startling impact on us. Raskin compares the coming revolution to the invention of 
the telescope. “We looked out at the universe and discovered that Earth was not the 
center,” he says. The power of AI to reshape our understanding of animals, he thinks, will 
have a similar effect. “These tools are going to change the way that we see ourselves in
relation to everything.”


When Shane Gero got off his research vessel in Dominica after a recent day of fieldwork, 
he was excited. The sperm whales that he studies have complex social groups, and on this 
day one familiar young male had returned to his family, providing Gero and his colleagues
with an opportunity to record the group's vocalizations as they reunited.


For nearly 20 years Gero, a scientist in residence at Carleton University in Ottawa, kept 


capturing their clicking vocalizations and what the animals were doing when they made 
them. He found that the whales seemed to use specific patterns of sound, called codas, to 
identify one another. They learn these codas much the way toddlers learn words and
names, by repeating sounds the adults around them make.


Having decoded a few of these codas manually, Gero and his colleagues began to wonder 
whether they could use AI to speed up the translation. As a proof of concept, the team fed 
some of Gero's recordings to a neural network, an algorithm that learns skills by analyzing 
data. It was able to correctly identify a small subset of individual whales from the codas 99 
percent of the time. Next the team set an ambitious new goal: listen to large swathes of the 
ocean in the hopes of training a computer to learn to speak whale. Project CETI, for which 
Gero serves as lead biologist, plans to deploy an underwater microphone attached to a
buoy to record the vocalizations of Dominica's resident whales around the clock.


As sensors have gotten cheaper and technologies such as hydrophones, biologgers and
drones have improved, the amount of animal data has exploded. There's suddenly far too 
much for biologists to sift through efficiently by hand. AI thrives on vast quantities of 
information, though. Large language models such as ChatGPT must ingest massive 
amounts of text to learn how to respond to prompts: GPT-3, the model family underlying
ChatGPT, was trained on text drawn from around 45 terabytes of raw data, a good chunk of
the entire Library of Congress. Early models
required humans to classify much of those data with labels. In other words, people had to 
teach the machines what was important. But the next generation of models learned how to
“self-supervise,” automatically learning what's essential and independently creating an
algorithm for predicting which words come next in a sequence.
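
The idea of learning from raw, unlabeled text can be illustrated at toy scale. The sketch below is not how a large language model works internally; it is a simple bigram counter, swapped in here because it shows the same principle in a few lines: the only "labels" are the words that happen to follow one another in the text itself. The function names and the sample sentence are made up for illustration.

```python
from collections import Counter, defaultdict

def train_bigram(text):
    """Count, for every word, which words follow it in raw unlabeled text."""
    words = text.lower().split()
    follows = defaultdict(Counter)
    # Each adjacent pair of words is a free training example: no human labels needed.
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows

def predict_next(follows, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    counts = follows.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]

# Hypothetical sample text, invented for this example.
corpus = "the whale clicks and the whale dives and the crow calls"
model = train_bigram(corpus)
print(predict_next(model, "the"))
```

Real models replace the counting table with a neural network and far more context than a single preceding word, but the training signal is the same: predict what comes next, then check against the text.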


In 2017 two research groups discovered a way to translate between human languages 
without the need for a Rosetta stone. The discovery hinged on turning the semantic 
relations between words into geometric ones. Machine-learning models are now able to 
translate between unknown human languages by aligning their shapes—using the 
frequency with which words such as “mother” and “daughter” appear near each other, for 
example, to accurately predict what comes next. “There's this hidden underlying structure 
that seems to unite us all,” Raskin says. “The door has been opened to using machine
learning to decode languages that we don't already know how to decode.”
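
To make "aligning shapes" concrete, here is a deliberately small sketch. It assumes each word is a point in a two-dimensional space and that the second language's points are simply a rotated copy of the first's. Unlike the 2017 methods, which needed no dictionary at all, this toy version cheats by using three known word pairs to find the rotation, then checks whether a fourth, unseen word lands on its translation. All vectors and word lists are invented for illustration.

```python
import math

def best_rotation(source, target):
    """Angle (radians) that rotates 2-D source points onto the matching
    target points with the least squared error."""
    dot = sum(sx * tx + sy * ty for (sx, sy), (tx, ty) in zip(source, target))
    cross = sum(sx * ty - sy * tx for (sx, sy), (tx, ty) in zip(source, target))
    return math.atan2(cross, dot)

def rotate(point, angle):
    """Rotate a 2-D point counterclockwise by `angle` radians."""
    x, y = point
    return (x * math.cos(angle) - y * math.sin(angle),
            x * math.sin(angle) + y * math.cos(angle))

def nearest(point, vocab):
    """Word in `vocab` whose vector is closest to `point`."""
    return min(vocab, key=lambda word: math.dist(point, vocab[word]))

# Made-up 2-D "embeddings"; real ones have hundreds of dimensions.
english = {"mother": (1.0, 0.2), "daughter": (0.9, 0.5),
           "water": (-0.8, 0.6), "stone": (-0.3, -1.0)}
# The second language has the same shape, turned by 90 degrees.
spanish = {"madre": (-0.2, 1.0), "hija": (-0.5, 0.9),
           "agua": (-0.6, -0.8), "piedra": (1.0, -0.3)}

# Assumption: three seed pairs are known; "stone" is held out.
seeds = [("mother", "madre"), ("daughter", "hija"), ("water", "agua")]
angle = best_rotation([english[e] for e, _ in seeds],
                      [spanish[s] for _, s in seeds])
print(nearest(rotate(english["stone"], angle), spanish))
```

The point of the exercise is that once the two clouds of points are lined up, a word never seen in a dictionary can still be translated by its position alone.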


The field hit another milestone in 2020, when natural-language processing began to be 
able to “treat everything as a language,” Raskin explains. Take, for example, DALL-E 2, 
one of the AI systems that can generate realistic images based on verbal descriptions. It 
maps the shapes that represent text to the shapes that represent images with remarkable 
accuracy, exactly the kind of “multimodal” analysis the translation of animal
communication will probably require.


Many animals use different modes of communication simultaneously, just as humans use 
body language and gestures while talking. Any actions made immediately before, during,
or after uttering sounds could provide important context for understanding what an 
animal is trying to convey. Traditionally, researchers have cataloged these behaviors in a 
list known as an ethogram. With the right training, machine-learning models could help 
parse these behaviors and perhaps discover novel patterns in the data. Scientists writing in 
the journal Nature Communications last year, for example, reported that a model found 
previously unrecognized differences in Zebra Finch songs that females pay attention to 
when choosing mates. Females prefer partners that sing like the birds the females grew up
with.


You can already use one kind of AI-powered analysis with Merlin, a free app from the
Cornell Lab of Ornithology that identifies bird species. To identify a bird by sound, Merlin 
takes a user's recording and converts it into a spectrogram—a visualization of the volume, 
pitch and length of the bird's call. The model is trained on Cornell's audio library, against 
which it compares the user's recording to predict the species identification. It then 
compares this guess to eBird, Cornell's global database of observations, to make sure it's a 
species that one would expect to find in the user's location. Merlin can identify calls from
more than 1,000 bird species with remarkable accuracy.
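
For readers curious what a spectrogram actually is, the following sketch computes one from scratch. It is not Merlin's pipeline, whose internals are not described here; it is a minimal short-time Fourier transform written with only the Python standard library. The synthetic "call" (a 1,000 Hz tone followed by a 2,000 Hz tone), the sample rate and the frame sizes are all assumptions chosen to keep the example small.

```python
import cmath
import math

def spectrogram(samples, frame_size=64, hop=32):
    """Slice audio into overlapping frames and measure loudness per pitch.
    Returns a list of frames; each frame lists magnitudes by frequency bin."""
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = samples[start:start + frame_size]
        # Hann window: fade the frame edges to reduce spectral smearing.
        windowed = [s * 0.5 * (1 - math.cos(2 * math.pi * n / (frame_size - 1)))
                    for n, s in enumerate(frame)]
        magnitudes = []
        for k in range(frame_size // 2 + 1):
            # Naive discrete Fourier transform for frequency bin k.
            total = sum(w * cmath.exp(-2j * math.pi * k * n / frame_size)
                        for n, w in enumerate(windowed))
            magnitudes.append(abs(total))
        frames.append(magnitudes)
    return frames

def loudest_hz(frame, sample_rate, frame_size=64):
    """Frequency (Hz) of the strongest bin in one spectrogram frame."""
    k = max(range(len(frame)), key=frame.__getitem__)
    return k * sample_rate / frame_size

# Synthetic stand-in for a bird call: the pitch jumps halfway through.
SAMPLE_RATE = 8000
call = [math.sin(2 * math.pi * (1000 if n < 256 else 2000) * n / SAMPLE_RATE)
        for n in range(512)]

spec = spectrogram(call)
print(loudest_hz(spec[0], SAMPLE_RATE), loudest_hz(spec[-1], SAMPLE_RATE))
```

Each row of `spec` is one slice of time and each column one band of pitch, which is why a spectrogram can be drawn as an image and handed to the same kinds of models that classify photographs.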


But the world is loud, and singling out the tune of one bird or whale from the cacophony is 
difficult. The challenge of isolating and recognizing individual speakers, known as the 
cocktail party problem, has long plagued efforts to process animal vocalizations. In 2021 
the Earth Species Project built a neural network that can separate overlapping animal 
sounds into individual tracks and filter background noise, such as car honks—and it 
released the open-source code for free. It works by creating a visual representation of the 
sound, which the neural network uses to determine which pixel is produced by which 
speaker. In addition, the Earth Species Project recently developed a so-called foundation
model that can automatically detect and classify patterns in datasets.


New Caledonian Crows, which are famous for their toolmaking abilities, have regionally distinctive
vocalizations that could one day be deciphered using AI. Credit: Jean-Paul Ferrero/Auscape
International Pty Ltd/Alamy Stock Photo


Not only are these tools transforming research, but they also have practical value. If 
scientists can translate animal sounds, they may be able to help imperiled species. The 
Hawaiian Crow, known locally as the ‘Alala, went extinct in the wild in the early 2000s. 
The last birds were brought into captivity to start a conservation breeding program. 
Expanding on his work with the New Caledonian Crow, Rutz is now collaborating with the 
Earth Species Project to study the Hawaiian Crow's vocabulary. “This species has been 
removed from its natural environment for a very long time,” he says. He is developing an 
inventory of all the calls the captive birds currently use. He'll compare that to historical 
recordings of the last wild Hawaiian Crows to determine whether their repertoire has 
changed in captivity. He wants to know whether they may have lost important calls, such 
as those pertaining to predators or courtship, which could help explain why reintroducing
the crow to the wild has proved so difficult.


Machine-learning models could someday help us figure out our pets, too. For a long time 
animal behaviorists didn't pay much attention to domestic pets, says Con Slobodchikoff, 
author of Chasing Doctor Dolittle: Learning the Language of Animals. When he began his 
career studying prairie dogs, he quickly gained an appreciation for their sophisticated 
calls, which can describe the size and shape of predators. That experience helped to inform 
his later work as a behavioral consultant for misbehaving dogs. He found that many of his 
clients completely misunderstood what their dog was trying to convey. When our pets try 
to communicate with us, they often use multimodal signals, such as a bark combined with 
a body posture. Yet “we are so fixated on sound being the only valid element of
communication, that we miss many of the other cues,” he says.


Now Slobodchikoff is developing an AI model aimed at translating a dog's facial
expressions and barks for its owner. He has no doubt that as researchers expand their 
studies to domestic animals, machine-learning advances will reveal surprising capabilities
in pets. “Animals have thoughts, hopes, maybe dreams of their own,” he says.


Farmed animals could also benefit from such depth of understanding. Elodie F. Briefer, an 
associate professor in animal behavior at the University of Copenhagen, has shown that 
it's possible to assess animals' emotional states based on their vocalizations. She recently 
created an algorithm trained on thousands of pig sounds that uses machine learning to 
predict whether the animals were experiencing a positive or negative emotion. Briefer says
a better grasp of how animals experience feelings could spur efforts to improve their
welfare.


But as good as language models are at finding patterns, they aren't actually deciphering 
meaning—and they definitely aren't always right. Even AI experts often don't understand 
how algorithms arrive at their conclusions, making them harder to validate. Benjamin 
Hoffman, who helped to develop the Merlin app before joining the Earth Species Project, 
says that one of the biggest challenges scientists now face is figuring out how to learn from
what these models discover.


“The choices made on the machine-learning side affect what kinds of scientific questions 
we can ask,” Hoffman says. Merlin Sound ID, he explains, can help detect which birds are 
present, which is useful for ecological research. It can't, however, help answer questions 
about behavior, such as what types of calls an individual bird makes when it interacts with 
a potential mate. In trying to interpret different kinds of animal communication, Hoffman 
says researchers must also “understand what the computer is doing when it's learning how 
to do that.” 


Daniela Rus, director of the Massachusetts Institute of Technology Computer
Science and Artificial Intelligence Laboratory, leans back in an armchair in her
office, surrounded by books and stacks of papers. She is eager to explore the
new possibilities for studying animal communication that machine learning has opened 
up. Rus previously designed remote-controlled robots to collect data for whale-behavior 
research in collaboration with biologist Roger Payne, whose recordings of humpback 
whale songs in the 1970s helped to popularize the Save the Whales movement. Now Rus is 
bringing her programming experience to Project CETI. Sensors for underwater monitoring 
have rapidly advanced, providing the equipment necessary to capture animal sounds and 
behavior. And AI models capable of analyzing those data have improved dramatically. But
until recently, the two disciplines hadn't been joined.


At Project CETI, Rus's first task was to isolate sperm whale clicks from the background 
noise of the ocean realm. Sperm whales' vocalizations were long compared to binary code 
in the way that they represent information. But they are more sophisticated than that. 
After she developed accurate acoustic measurements, Rus used machine learning to 
analyze how these clicks combine into codas, looking for patterns and sequences. “Once 
you have this basic ability,” she says, “then we can start studying what are some of the 
foundational components of the language.” The team will tackle that question directly, Rus
says, “analyzing whether the [sperm whale] lexicon has the properties of language or not.”


But grasping the structure of a language is not a prerequisite to speaking it—not anymore, 
anyway. It's now possible for AI to take three seconds of human speech and then hold 
forth at length with its same patterns and intonations in an exact mimicry. In the next year 
or two, Raskin predicts, “we'll be able to build this for animal communication.” The Earth 
Species Project is already developing AI models that emulate a variety of species, with the 
aim of having “conversations” with animals. He says two-way communication will make it
that much easier for researchers to infer the meaning of animal vocalizations.


In collaboration with outside biologists, the Earth Species Project plans to test playback
experiments, playing an artificially generated call to Zebra Finches in a laboratory setting 
and then observing how the birds respond. Soon “we'll be able to pass the finch, crow or 
whale Turing test,” Raskin asserts, referring to the point at which the animals won't be 
able to tell they are conversing with a machine rather than one of their own. “The plot
twist is that we will be able to communicate before we understand.”


The prospect of this achievement raises ethical concerns. Karen Bakker, a digital 
innovations researcher and author of The Sounds of Life: How Digital Technology Is 
Bringing Us Closer to the Worlds of Animals and Plants, explains that there may be 
unintended ramifications. Commercial industries could use AI for precision fishing by 
listening for schools of target species or their predators; poachers could deploy these 
techniques to locate endangered animals and impersonate their calls to lure them closer. 
For animals such as humpback whales, whose mysterious songs can spread across oceans 
with remarkable speed, the creation of a synthetic song could, Bakker says, “inject a viral
meme into the world's population” with unknown social consequences.


So far the organizations at the leading edge of this animal-communication work are 
nonprofits like the Earth Species Project that are committed to open-source sharing of 
data and models and staffed by enthusiastic scientists driven by their passion for the 
animals they study. But the field might not stay that way—profit-driven players could 
misuse this technology. In a recent article in Science, Rutz and his co-authors noted that 
“best-practice guidelines and appropriate legislative frameworks” are urgently needed. 
“It's not enough to make the technology,” Raskin warns. “Every time you invent a
technology, you also invent a responsibility.”


Designing a “whale chatbot,” as Project CETI aspires to do, isn't as simple as figuring out
how to replicate sperm whales' clicks and whistles; it also demands that we imagine an
animal's experience. Despite major physical differences, humans actually share many
basic forms of communication with other animals. Consider the interactions between 
parents and offspring. The cries of mammalian infants, for example, can be incredibly 
similar, to the point that white-tailed deer will respond to whimpers whether they're made
by marmots, humans or seals. Vocal expression in different species can develop similarly,
too. Like human babies, harbor seal pups learn to change their pitch to target a parent's


sequence of syllables learned from a tutor,” explains Johnathan Fritz, a research scientist
at the University of Maryland's Brain and Behavior Initiative.


Whether animal utterances are comparable to human language in terms of what they 
convey remains a matter of profound disagreement, however. “Some would assert that 
language is essentially defined in terms that make humans the only animal capable of 
language,” Bakker says, with rules for grammar and syntax. Skeptics worry that treating
animal communication as language, or attempting to translate it, may distort its meaning.


Raskin shrugs off these concerns. He doubts animals are saying “pass me the banana,” but 
he suspects we will discover some basis for communication in common experiences. “It 
wouldn't surprise me if we discovered [expressions for] ‘grief’ or ‘mother’ or ‘hungry’ 
across species,” he says. After all, the fossil record shows that creatures such as whales 
have been vocalizing for tens of millions of years. “For something to survive a long time, it
has to encode something very deep and very true.”


Ultimately real translation may require not just new tools but the ability to see past our 
own biases and expectations. Last year, as the crusts of snow retreated behind my house, a 
pair of Sandhill Cranes began to stalk the brambles. A courtship progressed, the male 
solicitous and preening. Soon every morning one bird flapped off alone to forage while the
other stayed behind to tend their eggs. We fell into a routine, the birds and I: as the sun
crested the hill, I kept one eye toward the windows, counting the days as I imagined cells
dividing, new wings forming in the warm, amniotic dark.


Then one morning it ended. Somewhere behind the house the birds began to wail, twining 
their voices into a piercing cry until suddenly I saw them both running down the hill into 
the stutter start of flight. They circled once and then disappeared. I waited for days, but I
never saw them again.


Wondering if they were mourning a failed nest or whether I was reading too much into 
their behavior, I reached out to George Happ and Christy Yuncker, retired scientists who 
for two decades shared their pond in Alaska with a pair of wild Sandhill Cranes they 
nicknamed Millie and Roy. They assured me that they, too, had seen the birds react to 
death. After one of Millie and Roy's colts died, Roy began picking up blades of grass and 
dropping them near his offspring's body. That evening, as the sun slipped toward the 
horizon, the family began to dance. The surviving colt joined its parents as they wheeled
and jumped, throwing their long necks back to the sky.


Happ knows critics might disapprove of their explaining the birds' behaviors as grief, 
considering that “we cannot precisely specify the underlying physiological correlates.” But 
based on the researchers’ close observations of the crane couple over a decade, he writes, 
interpreting these striking reactions as devoid of emotion “flies in the face of the
evidence.”


Everyone can eventually relate to the pain of losing a loved one. It's a moment ripe for
translation.


Perhaps the true value of any language is that it helps us relate to others and in so doing
frees us from the confines of our own minds. Every spring, as the light swept back over
Yuncker and Happ's home, they waited for Millie and Roy to return. In 2017 they waited in
vain. Other cranes vied for the territory. The two scientists missed watching the colts hatch 
and grow. But last summer a new crane pair built a nest. Before long, their colts peeped 
through the tall grass, begging for food and learning to dance. Life began a new cycle. 


“We're always looking at nature,” Yuncker says, “when really, we're part of it.” 